Production I/O Characterization on the Cray XE6

Authors

  • Philip Carns
  • Yushu Yao
  • Kevin Harms
  • Robert Latham
  • Robert Ross
  • Katie Antypas

Abstract

I/O performance is an increasingly important factor in the productivity of large-scale HPC systems such as Hopper, a 153,216-core Cray XE6 system operated by the National Energy Research Scientific Computing Center. The scientific workload diversity of such systems presents a challenge for I/O performance tuning, however. Applications vary in terms of data volume, I/O strategy, and access method, making it difficult to consistently evaluate and enhance their I/O performance. We have adapted the Darshan I/O characterization tool for use on Hopper in order to address this challenge. Darshan is an I/O instrumentation library that collects I/O access pattern information from large-scale production applications with minimal overhead. In this paper we present our experiences in deploying Darshan on the Cray XE6 platform, including performance evaluation of Darshan with up to 98,304 processes and a case study of how to identify applications that can benefit most from I/O performance tuning. Darshan was automatically enabled for all Hopper users in November 2012 and instruments over 5,000 jobs per day as of April 2013.

Similar Resources

Trillion Particles, 120,000 cores and 350 TBs: Lessons Learned from a Hero I/O Run on Hopper

Modern petascale applications can present a variety of configuration, runtime, and data management challenges when run at scale. In this paper, we describe our experiences in running VPIC, a large-scale plasma physics simulation, on the NERSC production Cray XE6 system Hopper. The simulation ran on 120,000 cores using ∼80% of computing resources, 90% of the available memory on each node and 50%...

Tuning Parallel I/O on Blue Waters for Writing 10 Trillion Particles

Large-scale simulations running on hundreds of thousands of processors produce hundreds of terabytes of data that need to be written to files for analysis. One such application is the VPIC code, which simulates plasma behavior such as magnetic reconnection and turbulence in solar weather. The number of particles VPIC simulates is in the range of trillions and the size of data files to store is in the...

Characterizing I/O Performance Using the TAU Performance System

TAU is an integrated toolkit for performance instrumentation, measurement, and analysis. It provides a flexible, portable, and scalable set of technologies for performance evaluation on extreme-scale HPC systems. This paper describes alternatives for I/O instrumentation provided by TAU and the design and implementation of a new tool, tau_gen_wrapper, to wrap external libraries. It describes thr...

A File System Utilization Metric for I/O Characterization

A high performance computing (HPC) platform today typically contains a scratch high-performance parallel file system for data storage. Today, such file systems encompass 10-20% of the purchase price of an HPC resource. Looking forward, it is apparent that the rate of increase of hard drive performance will not keep up with the expected gains in processing, and therefore any effort to keep I/O pe...

Transitioning Users from the Franklin XT4 System to the Hopper XE6 System

The Hopper XE6 system, NERSC’s first peta-flop system with over 153,000 cores, has increased the computing hours available to the Department of Energy’s Office of Science users by more than a factor of 4. As NERSC users transition from the Franklin XT4 system with 4 cores per node to the Hopper XE6 system with 24 cores per node, they have had to adapt to a lower amount of memory per core and onn...

Publication date: 2013